On building phonetically and prosodically rich speech corpus for text-to-speech synthesis

Authors

  • Jindrich Matousek
  • Jan Romportl
Abstract

This paper proposes a way of preparing and recording a speech corpus for unit selection text-to-speech synthesis driven by symbolic prosody. The research focuses on an algorithm for selecting phonetically and prosodically rich sentences. A symbolic description at the deep prosody level is used to enrich the phonetic representation of sentences (by taking into account the prosodeme types in which phones appear). The resulting algorithm then selects sentences with respect to both phonetic and prosodic criteria. To cover supra-sentential prosody phenomena, paragraphs were selected at random and recorded as well. The new speech corpus can be utilised in unit selection speech synthesis and also for training a data-driven prosodic parser.
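The abstract does not spell out the selection procedure, so the following is only a minimal sketch of one common approach: greedy coverage over units, where each unit is a phone paired with the prosodeme type it appears in. The function name `greedy_select`, the input layout, and the prosodeme labels used in the usage note are all hypothetical and are not taken from the paper.

```python
def greedy_select(sentences, n_select):
    """Greedily pick sentences that add the most unseen (phone, prosodeme) units.

    `sentences` is a list in which each sentence is a list of
    (phone, prosodeme_type) tuples. This layout is an assumption made
    for illustration, not the representation used by the authors.
    Returns the chosen sentence indices and the set of covered units.
    """
    covered = set()
    chosen = []
    remaining = dict(enumerate(sentences))

    while remaining and len(chosen) < n_select:
        # Gain = number of units in the sentence not yet covered.
        # Ties are broken in favour of the lowest sentence index.
        best = max(
            remaining,
            key=lambda i: (len(set(remaining[i]) - covered), -i),
        )
        gain = len(set(remaining[best]) - covered)
        if gain == 0:
            break  # no remaining sentence adds a new unit
        covered |= set(remaining[best])
        chosen.append(best)
        del remaining[best]

    return chosen, covered
```

Calling `greedy_select(sentences, 3)` on a small list of tagged sentences returns the indices of the selected sentences together with the units they cover. Pairing each phone with its prosodeme type means the same phone in two prosodic contexts counts as two distinct units, which is one way to make a single criterion reflect both phonetic and prosodic richness.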


Similar resources

Building of a Speech Corpus Optimised for Unit Selection TTS Synthesis

The paper deals with the process of designing a phonetically and prosodically rich speech corpus for unit selection speech synthesis. The attention is given mainly to the recording and verification stage of the process. In order to ensure as high quality and consistency of the recordings as possible, a special recording environment consisting of a recording session management and “pluggable” ch...


Recording and Annotation of Speech Corpus for Czech Unit Selection Speech Synthesis

The paper gives a brief summarisation of preparation and recording of a phonetically and prosodically rich speech corpus for Czech unit selection text-to-speech synthesis. Special attention is paid to the process of two-phase orthographic annotations of recorded sentences with regard to their coherence.


A set of corpus-based text-to-speech synthesis technologies for Mandarin Chinese

This paper presents a set of corpus-based text-to-speech synthesis technologies for Mandarin Chinese. A large speech corpus produced by a single speaker is used, and the speech output is synthesized from waveform units of variable lengths, with desired linguistic properties, retrieved from this corpus. Detailed methodologies were developed for designing “phonetically rich” and “prosodically ric...


Prosody annotation for corpus based speech synthesis

The paper concerns prosody annotation especially for application in a corpus based speech synthesis. In order to establish the rules of automatic intonation modelling, phonetically labeled speech database of 4 hours has been perceptually and acoustically analyzed. The speech material included different text types and prosodically rich phrases. The annotation of the speech database consists in p...


Efficient Diphone Database Creation for MBROLA, a Multilingual Speech Synthesiser

Diphone synthesis is a convenient way of testing phonetic models of human speech. It allows easy manipulation of duration and pitch; it is therefore used not only for general intonation contour evaluation, but also for expressive speech synthesis. The main advantage of using MBROLA [11], [9], [12], [13] is the fact that not all the diphones need to be contained in the voice to test speech models. ...



Publication date: 2006